Instructions 

Read each question carefully and write your answer legibly on the examination paper. No other 
paper will be accepted. You may use the backs of pages for rough work but all final answers must 
be in the spaces provided. The marks for each question are as indicated. Allocate your time 
accordingly. 
Ensure that your name is on every page. 

Note: a reference table of MIPS instructions is provided at the end of the examination paper. 

1. General (10 marks) Give the technical term that best fits each of the following descriptions or 
definitions. 

(a) The type of locality exhibited by a program that sequentially reads the elements of an array. 


(b) A number in scientific notation, such that the single digit to the left of the decimal point is 
non-zero. 


(c) A style of instruction set architecture in which only a single register is available and all 
arithmetic is performed using that single register. 


(d) A field in each entry in a processor cache that contains the address information required to 
determine whether the block stored there is the one being searched for. 


(e) A numbering system that uses only the digits 0-7. 


(f) The breakdown of a machine language instruction into fields with particular sizes and 
meanings. 


(g) Information that tells the linker which words must be changed when an object file is shifted in 
the address space. 


(h) With this MIPS assembly language instruction, the first operand is assigned the value one if 
the second operand is less than the third operand, and zero otherwise. 


(i) A type of machine language branch instruction for which some fixed number of sequentially 
following instructions are executed regardless of the outcome of the branch test. 


(j) In a computer system using virtual memory, the result of attempting to access a page that is 
not resident in main memory. 



Name: 



Student Number: 


2. Computer Performance (16 marks in total) 

(a) (8 marks) Give a precise explanation for why each of the following scenarios is either 
impossible, or, at least, highly unlikely. 

(i) Use of a new compiler results in a higher MIPS value, a lower instruction count, 
and a higher CPU execution time, for some particular application and system. 


(ii) After increasing the clock rate of a particular processor (keeping everything else 
fixed), the MIPS value for a certain application increases by 50%, and the CPU 
execution time decreases by 50%. 


(iii) A new, complex machine language instruction is implemented for an operation 
that was previously done in software (i.e., with a sequence of simpler instructions), 
without impacting the implementation of the existing instructions or their 
execution times. The MIPS value for a particular application that makes heavy use 
of the operation is found to increase. 


(iv) An application is ported to a processor with a substantially different instruction set 
architecture, but the same clock rate. Both the average CPI and the CPU execution 
time decrease by 28%. 


(b) (2 marks) Why might it be a bad idea, when comparing systems using multiple benchmarks, 
to use the sum of the benchmark execution times on each system as the performance metric? 


(c) (2 marks) Assuming CISC and RISC processors of similar age, development cost, and clock 
rate, which would you expect to have a higher CPI? Explain. 


(d) (2 marks) Consider a system with two classes of instructions A and B. Class A instructions 
have a CPI of 3, while class B instructions have a CPI of 6. The clock rate is 500 MHz. A 
particular application executes 100 million class A instructions, and Nb class B instructions. 
Write a formula for the CPU execution time, as a function of Nb. 


(e) (2 marks) Consider a system with two classes of instructions A and B. Class A instructions 
have a CPI of 3, while class B instructions have a CPI of 6. Suppose that the CPI for class A 
instructions could be decreased to 2, at the cost of increasing the clock cycle time by 20%. 
For what range of values of the fraction of instructions of type A, fA, would the program 
execution time decrease? 
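
For reference, parts (d) and (e) both rest on the relation CPU time = (total clock cycles) / (clock rate), where the total cycle count is the sum over instruction classes of instruction count times CPI. A minimal Python sketch of that relation follows; the function name `cpu_time` and the sample counts, CPIs, and clock rate are illustrative assumptions and are not the values used in this question.

```python
def cpu_time(classes, clock_rate_hz):
    """Return CPU execution time in seconds.

    classes: list of (instruction_count, cpi) pairs, one per instruction class.
    clock_rate_hz: clock rate in cycles per second.
    """
    total_cycles = sum(count * cpi for count, cpi in classes)
    return total_cycles / clock_rate_hz


# Illustrative values only (assumed, not from the exam):
# 2 million instructions at CPI 1 and 1 million at CPI 4, 100 MHz clock.
example_seconds = cpu_time([(2_000_000, 1), (1_000_000, 4)], 100e6)
```

Lengthening the clock cycle time by some percentage corresponds to dividing `clock_rate_hz` by the same factor, so a trade-off between CPI and cycle time can be explored by calling the function twice and comparing the results.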
3. Arithmetic (16 marks in total) 

(a) (2 marks) Recall that in the IEEE 754 floating point standard, single precision floating point 
numbers have a 1 bit sign field, followed by an 8 bit exponent field (in biased notation with a 
bias of 127), followed by a 23 bit field giving the digits of the fractional part. Give the 
number (in base 10) that is represented by 11000000011100000000000000000000. 
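
The field layout recalled above can be sketched in Python as follows. This is an illustrative decoder for normalized values only; the function name `decode_single` and the sample bit pattern are assumptions, and the pattern is deliberately different from the one in the question.

```python
import struct


def decode_single(bits):
    """Decode a 32-character bit string as a normalized IEEE 754 single."""
    if len(bits) != 32:
        raise ValueError("expected exactly 32 bits")
    sign = int(bits[0])            # 1 bit sign field
    exponent = int(bits[1:9], 2)   # 8 bit exponent field, bias 127
    fraction = int(bits[9:], 2)    # 23 bit fraction field
    if exponent in (0, 255):
        raise ValueError("zero, denormal, infinity and NaN are not handled in this sketch")
    significand = 1 + fraction / 2**23  # implicit leading 1
    return (-1) ** sign * significand * 2.0 ** (exponent - 127)


# Assumed sample pattern: sign 0, exponent 10000001, fraction 0100...0
example_bits = "0" + "10000001" + "01" + "0" * 21
value = decode_single(example_bits)

# Cross-check against the machine's own interpretation of the same 32 bits.
check = struct.unpack(">f", int(example_bits, 2).to_bytes(4, "big"))[0]
```

Here the exponent field is 129, so the scale factor is 2^(129 - 127) = 4, and the significand is 1.25, giving 5.0.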


(b) (3 marks) Give a truth table for a logic function whose 3 inputs are the binary digits of a 3 bit 
2’s complement number, and whose 3 outputs give the binary digits of the absolute value of 
that number. State any required assumptions. 


(c) (8 marks) As functions of n, give the range of integers that can be represented in: 


(i) n bit 2’s complement 


(ii) n bit biased notation with bias of 2^(n-1) - 1 


(iii) n bit 1’s complement 


(iv) n bit sign-magnitude 


(d) (3 marks) Calculate 11110 x 11011 using Booth’s algorithm. Clearly show each step. 
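
For reference, one common textbook formulation of Booth's algorithm keeps an n bit accumulator A, the multiplier Q, and one extra bit Q-1. At each of n steps it subtracts the multiplicand M from A when (Q0, Q-1) = (1, 0), adds it when (Q0, Q-1) = (0, 1), and then arithmetically shifts A:Q:Q-1 right by one bit. The Python sketch below follows that formulation; the notation used in the course may differ, the function name `booth_multiply` is hypothetical, and the sketch assumes the multiplicand is not the most negative n bit value.

```python
def booth_multiply(multiplicand, multiplier, n):
    """Multiply two n bit two's complement integers using Booth's algorithm.

    Returns (product, steps); each step is (operation, A, Q, Q-1) after the shift.
    Assumes multiplicand != -2**(n-1), which would overflow the n bit accumulator.
    """
    mask = (1 << n) - 1
    m = multiplicand & mask
    a, q, q_minus1 = 0, multiplier & mask, 0
    steps = []
    for _ in range(n):
        pair = (q & 1, q_minus1)
        if pair == (1, 0):
            a = (a - m) & mask
            op = "A = A - M"
        elif pair == (0, 1):
            a = (a + m) & mask
            op = "A = A + M"
        else:
            op = "no arithmetic"
        # Arithmetic right shift of the combined register A:Q:Q-1.
        q_minus1 = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & (1 << (n - 1)))  # replicate the sign bit of A
        steps.append((op, format(a, f"0{n}b"), format(q, f"0{n}b"), q_minus1))
    product = (a << n) | q
    if product >> (2 * n - 1):  # interpret the 2n bit result as two's complement
        product -= 1 << (2 * n)
    return product, steps


# Assumed 4 bit example (not the operands from the question): 3 x (-4).
product, steps = booth_multiply(3, -4, 4)
```

Printing `steps` gives one row per iteration, which matches the "show each step" layout usually expected for a hand calculation.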


4. Machine and Assembly Language (20 marks in total) 

(a) (5 marks) For each of the five MIPS addressing modes, give an example instruction that 
utilizes that mode. In each case, describe how the respective addressing mode works. 


(b) (5 marks) List three examples of the types of items that might be found in a procedure call 
frame. 


(c) (6 marks) Consider a linked list data structure in which each node is implemented with two 
consecutive words of memory. The first word of each node contains an integer value. The 
second word contains the memory address of (the first word of) the next node in the list. 
Assume that a memory location with label head contains the memory address of the first node 
in the list, and that a memory address value of zero indicates the end of the list. Write a MIPS 
procedure found that takes as its argument an integer value n, and returns 1 if there is a node 
in the linked list that contains that value, and 0 otherwise. (You do NOT need to write a main 
program. Assume the procedure calling conventions used throughout the course.) 


(d) (6 marks) Translate the following pseudo-code into an equivalent sequence of MIPS 
assembly language instructions, assuming that register $s0 corresponds to the integer variable 
“i”, register $s1 holds the base address of the integer array A (with elements indexed starting 
from 0), and register $s2 corresponds to the integer variable “N”. Clearly identify the purpose 
of any other variables or registers that you may create or use. 

for (i = 0; i < N; i++){ 

if (A[i] < 0) A[i] = A[i]*2; 

} 


5. Datapath and Control (20 marks in total) 

(a) (5 marks) List the three basic types of pipeline hazards. 


(b) (2 marks) Give an example sequence of outcomes for a particular branch (e.g., “Taken, Not 
Taken, ...”), of length at least 4, such that the 2-bit branch prediction scheme discussed in 
class will always make the wrong prediction, assuming that the last two outcomes for this 
branch prior to your sequence were both “Taken”. 


(c) (4 marks) Determine when each of the instructions in the sequence given below will be 
completed, assuming a 5-stage pipeline without forwarding. The five stages and their 
functions are given as follows: 


• IF  - instruction fetch (from instruction memory) 
• ID  - instruction decode and register file read 
• EX  - execute or address calculation 
• MEM - memory access (from/to data memory) 
• WB  - write back to register file 


Assume that if one instruction reads a register during the same clock cycle as another 
instruction is writing it, the new value will be read. Note that an instruction takes 5 clock 
cycles to complete (if not stalled because of a hazard). Do not reorder the instructions. 
Instructions are fetched and executed exactly in the order given below. 


Instruction sequence      Clock cycle at which instruction is completed 
                          (i.e., # clock cycles after start of instruction sequence) 

add $s6,$s1,$s3           _____ 

add $s1,$s2,$s3           _____ 

lw  $s5,0($s1)            _____ 

lw  $t1,0($s5)            _____ 


(d) (4 marks) Repeat part (c), but now assume that the pipeline does use forwarding. 

Instruction sequence      Clock cycle at which instruction is completed 
                          (i.e., # clock cycles after start of instruction sequence) 

add $s6,$s1,$s3           _____ 

add $s1,$s2,$s3           _____ 

lw  $s5,0($s1)            _____ 

lw  $t1,0($s5)            _____ 


(e) (4 marks) Consider the multiple clock cycle and the pipelined datapaths discussed in class. 
Suppose that new implementation technology allows everything to be speeded up 
substantially, except for memory access. Specifically, the clock rate can be doubled with the 
new technology, but only if each memory access is given 2 clock cycles to complete rather 
than just 1 clock cycle. What would be the impact on performance for each of the multiple 
clock cycle datapath and the pipelined datapath? Be as precise as possible in your answer, and 
state any required assumptions. 


(f) (3 marks) Assuming the multiple clock cycle datapath described in class, describe what 
relevant actions take place during each of the clock cycles required for a “beq” instruction. 


6. Cache and Virtual Memory (18 marks in total) 

(a) (6 marks) Consider a cache with space to store 64 blocks. 

(i) Assuming the cache is set associative with each set having space for 4 blocks, in 
which set would memory block number 40 be stored? 


(ii) If the block size is 8 words, which memory block does the location with (byte) 
address 100 belong to? 


(iii) Assuming instead that the cache is direct mapped, and there are 2 words in a block, 
give a formula for the cache position that would be checked on a reference to 
(byte) address N. 
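
For reference, the mappings used in part (a) follow two general rules: a byte address belongs to memory block (byte address) div (bytes per block), and a memory block maps to cache set (block number) mod (number of sets), where a direct mapped cache is the special case of one block per set. The Python sketch below illustrates these rules under the assumption of 4 byte (32-bit MIPS) words; the function names and the sample cache geometry are hypothetical and differ from the values in the question.

```python
BYTES_PER_WORD = 4  # assumption: 32-bit MIPS words


def memory_block_number(byte_address, words_per_block):
    """Return the number of the memory block containing byte_address."""
    return byte_address // (words_per_block * BYTES_PER_WORD)


def cache_set_index(block_number, num_cache_blocks, associativity):
    """Return the set in which a memory block is placed.

    associativity is the number of blocks per set; 1 means direct mapped.
    """
    num_sets = num_cache_blocks // associativity
    return block_number % num_sets


# Assumed sample geometry: a 32 block cache and 4 word blocks.
block = memory_block_number(200, 4)        # byte address 200
two_way_set = cache_set_index(37, 32, 2)   # 2-way set associative, 16 sets
direct_slot = cache_set_index(45, 32, 1)   # direct mapped, 32 positions
```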


(b) (2 marks) Assuming a fixed total cache capacity (in bytes), describe the advantages and 
disadvantages of using a large block size, in comparison to a small block size. 


(c) (2 marks) What is a TLB, and why is it needed? 


(d) (4 marks) Outline how writes (i.e., stores) are handled in systems with processor caches. 


(e) (4 marks) Consider the following portion of a page table from a system with 8K byte pages. 
All values are given in decimal. 


virtual page number    physical page frame number 
0                      1513 
1                      1 
2                      3 
3                      2 
4                      929 
5                      0 


(i) Which virtual page contains the word with (decimal) virtual (byte) address 20000? 


(ii) What is the page offset for this word? 


(iii) In which physical page frame is it contained? 


(iv) What is the word’s physical memory address? 
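
For reference, the four steps in part (e) correspond to splitting a virtual byte address into a virtual page number and a page offset, looking up the page frame, and recombining. A minimal Python sketch follows, assuming 8K byte pages as in the question; the function name `translate`, the page table contents, and the sample address are hypothetical and are not the values given above.

```python
PAGE_SIZE = 8 * 1024  # 8K byte pages


def translate(virtual_address, page_table):
    """Return (virtual page number, page offset, frame, physical address)."""
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table[vpn]  # a missing entry would correspond to a fault
    return vpn, offset, frame, frame * PAGE_SIZE + offset


# Assumed page table and address, for illustration only.
hypothetical_table = {0: 7, 1: 2, 2: 5}
result = translate(9000, hypothetical_table)
```

With these assumed values, address 9000 lies in virtual page 1 at offset 808, and frame 2 gives physical address 2 x 8192 + 808 = 17192.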


(The End) 




